Corpus-Based Chinese-Korean Abstracting Translation System

Authors

Jun-Jie Li

Key-Sun Choi

Abstract

A Corpus-Based Chinese-Korean Abstracting Translation System is designed and implemented. First, a text indexing method called Natural Hierarchical Network (NHN) is introduced, and a Corpus-Based Word Segmentation algorithm is developed that achieves a segmentation correctness of 98% in open tests. Next, an abstracting system is implemented on top of a word weighting function and a sentence importance weighting function, which dynamically calculate the importance of words and sentences from word frequency in both the corpus and the context, word length, sentence length, and so on; the system produces abstracts of texts in different languages and domains at any abstracting rate. Experiments show that abstracts produced at 10% to 20% abstracting rates generally cover 90% of the important sentences of the input texts. Finally, in combination with an Example-Based Chinese-Korean Machine Translation System, the generated abstracts are translated into the target language with a translation correctness of more than 70%, using an important-words-oriented machine translation strategy.
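
The abstract does not give the weighting functions themselves, so the following is only a minimal sketch of the general idea: score words from in-text frequency, corpus frequency, and word length, score sentences by their average word weight, and keep the top fraction given by the abstracting rate. The function names (`word_weights`, `sentence_score`, `abstract`), the logarithmic rarity term, and the way the factors are multiplied are all assumptions made for illustration, not the authors' formulas. Input sentences are assumed to be already segmented into word lists.

```python
import math
from collections import Counter


def word_weights(sentences, corpus_freq, corpus_size):
    """Weight each word by in-text frequency, corpus rarity, and word length.

    The combination used here (a simple product) is an assumed form.
    """
    text_freq = Counter(w for s in sentences for w in s)
    weights = {}
    for word, tf in text_freq.items():
        # Words that are frequent in the text but rare in the corpus score higher.
        rarity = math.log((corpus_size + 1) / (corpus_freq.get(word, 0) + 1))
        weights[word] = tf * rarity * len(word)
    return weights


def sentence_score(sentence, weights):
    """Average word weight, so sentence length does not dominate the score."""
    if not sentence:
        return 0.0
    return sum(weights[w] for w in sentence) / len(sentence)


def abstract(sentences, corpus_freq, corpus_size, rate):
    """Return the top `rate` fraction of sentences, kept in original order."""
    weights = word_weights(sentences, corpus_freq, corpus_size)
    keep = max(1, round(len(sentences) * rate))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(sentences[i], weights),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:keep])]
```

With `rate=0.1` or `rate=0.2` this mirrors the 10% to 20% abstracting rates mentioned above; a sentence made up only of very common corpus words receives a score near zero and is dropped first.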

Similar resources

A Chinese POS Decision Method Using Korean Translation Information

In this paper we propose a method that imitates a translation expert by using Korean translation information, and we analyse its performance. Korean is easier to tag than Chinese, so we can exploit this property in Chinese POS tagging. Keywords: machine translation, part of speech tagging, corpus. Introduction: Previous POS (Part Of Speech) tagging methods for Chinese can be largely classified into 2. O...

The Use of Second-Person Reference in Advertisement Translation with Reference to Translation between Chinese and English

This research aimed to review the use of second-person reference in advertisement translation, work out the general rules, and provide guidance to translators. Using second-person reference is common in the advertising discourse. Addressing audiences directly involves their attention and in this way enhances their memorization of the advertised message. Second-person reference can be realized v...

Korean-Chinese-Japanese Multilingual Wordnet with Shared Semantic Hierarchy

A Chinese-Japanese-Korean wordnet is introduced. It is built on a shared semantic hierarchy that originates from NTT Goidaikei (Lexical Hierarchical System). The Korean wordnet was constructed by assigning a semantic category to every sense of the Korean words in a dictionary. The senses of verbs and adjectives are assigned to the same semantic hierarchy as those of nouns. Each sense of...

Bayesian Learning of Tokenization for Machine Translation

Training a statistical machine translation system starts with tokenizing a parallel corpus. Some languages such as Chinese do not incorporate spacing in their writing system, which creates a challenge for tokenization. Morphologically rich languages such as Korean and Hungarian present an even bigger challenge, since optimal token boundaries for machine translation in these languages are often ...

Unsupervised Tokenization for Machine Translation

Training a statistical machine translation system starts with tokenizing a parallel corpus. Some languages such as Chinese do not incorporate spacing in their writing system, which creates a challenge for tokenization. Moreover, morphologically rich languages such as Korean present an even bigger challenge, since optimal token boundaries for machine translation in these languages are often unclear. Bo...

Publication date: 1997
